Pooling versus model selection for nowcasting with many predictors: an application to German GDP

Authors

Vladimir Kuzin

Massimiliano Marcellino

Christian Schumacher

Heinz Herrmann

Thilo Liebig

Karl-Heinz Tödter

Abstract

This paper discusses pooling versus model selection for now- and forecasting in the presence of model uncertainty with large, unbalanced datasets. Empirically, unbalanced data is pervasive in economics and typically due to different sampling frequencies and publication delays. Two model classes suited to this context are factor models based on large datasets and mixed-data sampling (MIDAS) regressions with few predictors. The specification of these models requires several choices related to, amongst others, the factor estimation method and the number of factors, lag length and indicator selection. Thus, there are many sources of mis-specification when selecting a particular model, and an alternative could be pooling over a large set of models with different specifications. We evaluate the relative performance of pooling and model selection for now- and forecasting quarterly German GDP, a key macroeconomic indicator for the largest country in the euro area, with a large set of about one hundred monthly indicators. Our empirical findings provide strong support for pooling over many specifications rather than selecting a specific model.

Keywords: nowcasting, forecast combination, forecast pooling, model selection, mixed-frequency data, factor models, MIDAS

JEL-Classification: E37, C53

Non-technical summary

In this paper, we evaluate the empirical performance of new short-term forecasting methods with respect to now- and forecasting of German GDP. In general, forecasting in real time is subject to considerable uncertainty, and in our forecast exercise, we particularly account for two types of uncertainty: the uncertainty regarding the choice of the appropriate forecasting model and the uncertainty about the relevant business cycle indicators to be included in the model. In our paper, we consider forecast pooling methods to tackle both sources of forecast uncertainty.
In the empirical literature, forecast combinations are considered useful forecast tools, as they can insure against choosing an inappropriate single model by sharing the risk of model mis-specification between many models. In an empirical forecast comparison, we compare pooling to alternative methods of model selection for forecasting. We employ two alternative classes of econometric models to compute now- and forecasts: factor models based on large datasets and mixed-data sampling (MIDAS) regressions based on a few predictors. To evaluate the impact of mis-specification on the forecast accuracy, we compare ex-post and ex-ante forecasts. Ex-post forecasts are based on fixed model specifications that have been selected after inspecting their performance in a recursive comparison. Ex-ante forecasts, however, are based on models that have been specified without referring to forecast errors that are only known ex post. Thus, the ex-ante forecasts are better suited for a more realistic assessment of the models' performance. The ex-post forecasts provide stylised results based on optimised model structures that are not subject to model uncertainty. Thus, a comparison between ex-post and ex-ante forecasts isolates the effect of mis-specification on the forecast performance. A novel aspect of the current paper compared to the existing literature on forecast pooling is the explicit and model-consistent consideration of unbalanced datasets. In short-term forecasting exercises, there are often two relevant phenomena that lead to unbalanced datasets: first, the different sampling frequencies of the data, and, second, the missing observations at the end of the sample due to different publication lags, the so-called "ragged edge" in multivariate data. For example, interest rates are typically observed at higher frequency and much more timely than variables like GDP or other national accounts data. Short-term forecasts often refer to current-quarter forecasts and forecasts one quarter ahead.
In spite of these relatively short forecast horizons, the forecasts are subject to considerable uncertainty. One important reason for this is that the information content of forecasts from a particular model is often not constant over time due to structural instabilities, which is a common finding in the literature. Hence, a model may perform well in one evaluation period but worse in another after a structural break has occurred. One way to tackle this problem is forecast pooling, which means constructing a combined forecast from the output of a set of different forecasting models. An alternative to pooling is model selection based on statistical information criteria. The empirical findings for German GDP show the existence of many particular models and leading indicators that perform very well on an ex-post basis. However, this holds only if the optimal model structure and the relevant leading indicators are known, that is, in the framework of ex-post forecasts. In the case of ex-ante forecasts, without knowledge of the optimal model structure, the forecasting performance deteriorates dramatically when model selection based on information criteria is employed. In contrast, forecast pooling performs well overall. Although some of the individual best-performing models do better than the combinations, the majority of single models are generally outperformed. Furthermore, the forecasting power of single leading indicators and models turned out to change over time, whereas forecast combinations were stable overall. These results suggest that forecast pooling is a reliable and robust tool for short-term forecasting of macroeconomic activity.

Non-technical summary (translated from the German original)

This paper examines how well recent short-term forecasting methods predict the development of German gross domestic product (GDP).
It takes into account that there is uncertainty both about the choice of the form of the appropriate forecasting model and about the choice of the macroeconomic variables to be included, which are meant to provide information on future economic developments. To account for these uncertainties in forecasting, the paper applies alternative methods of forecast combination (forecast pooling). Combinations of forecasts spread the risk of mis-specification of individual models and have become established in the literature as promising forecasting tools. In an empirical forecast comparison, the forecast combinations are compared with the predictions of individual models, where both the appropriate model and the relevant predictors are selected using different approaches. The analysis draws on two alternative classes of econometric models from the recent literature: large factor models with large datasets and models based on the so-called MIDAS regression approach with few predictors. To evaluate the influence of mis-specification on forecast results for these model classes, the paper compares results based on ex-post and ex-ante forecasts. Ex-post forecasts are based on fixed model specifications that were selected, after carrying out a recursive forecast comparison, according to the forecast performance they achieved there. Ex-ante forecasts, by contrast, are based on models that are specified without recourse to forecast results known only ex post, and are therefore more appropriate for a realistic assessment under model uncertainty. The ex-post forecasts show idealised results based on an optimised model structure without model uncertainty, given knowledge of the forecast errors, so that a comparison between ex-post and ex-ante forecasts reveals the influence of mis-specification.
Compared with other work in the field of forecast combination, this paper explicitly and model-consistently takes into account that data are usually available in "unbalanced" form when forecasting: in particular, the data used have different frequencies and are only incompletely available at the current edge because of publication delays (the ragged-edge problem). For example, interest rates or other financial market data are available at a higher frequency and much earlier than many national accounts data. GDP, for instance, is available only as a quarterly figure and with a considerable time lag. The short-term forecasts refer to the current or the following quarter. Despite this short forecast horizon, they are usually subject to considerable uncertainty. In forecast comparisons, structural instabilities often mean that a forecasting model does not deliver consistently good forecast performance, that is, it performs relatively well compared with other models in one forecast period and, as a result of structural breaks, relatively poorly in other periods. Combining different forecasting models is an attempt to mitigate this problem. Alternatively, statistical information criteria can be used to select individual forecasting models. The empirical application to German GDP shows that a large number of individual models and leading indicators with considerable forecast accuracy can indeed be found. However, this holds only when the optimal model structure and the relevant business cycle indicators used as predictors are known, i.e., for ex-post forecasts. For ex-ante forecasts, that is, when the optimal structure of the forecasting model is not known and has to be determined, for example, with information criteria, the forecast accuracy of the individual models declines dramatically.
In contrast, the forecast combinations deliver good results under ex-ante conditions. Although the combined forecasts generally cannot outperform the best individual models selected ex post, their forecast errors are clearly below those of the large majority of individual models. Furthermore, the forecast accuracy of individual business cycle indicators fluctuates over time, whereas the combinations show stable results. These results suggest that combined forecasts are useful for short-term forecasting.
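
The contrast between pooling and model selection described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which combination weights or information criteria the paper uses, so the equal-weight mean, the median, the model names, the nowcast values and the criterion values below are all hypothetical assumptions chosen for illustration.

```python
from statistics import mean, median


def pool_forecasts(forecasts, method="mean"):
    """Combine individual model forecasts into one pooled forecast."""
    values = list(forecasts.values())
    if method == "mean":
        return mean(values)      # equal-weight average over all models
    if method == "median":
        return median(values)    # robust to a few extreme forecasts
    raise ValueError(f"unknown pooling method: {method}")


def select_by_criterion(forecasts, criteria):
    """Pick the single model with the smallest information criterion value."""
    best = min(criteria, key=criteria.get)
    return best, forecasts[best]


# Hypothetical nowcasts of quarterly GDP growth (percent) from three models
forecasts = {"midas_indicator_a": 0.4, "midas_indicator_b": 0.1, "factor_model": 0.3}
# Hypothetical in-sample information criterion values (lower is better)
criteria = {"midas_indicator_a": 152.3, "midas_indicator_b": 149.8, "factor_model": 151.0}

pooled_mean = pool_forecasts(forecasts, "mean")
pooled_median = pool_forecasts(forecasts, "median")
chosen_model, selected = select_by_criterion(forecasts, criteria)

print(f"pooled mean:   {pooled_mean:.3f}")
print(f"pooled median: {pooled_median:.3f}")
print(f"selected ({chosen_model}): {selected:.3f}")
```

Selection stakes the whole nowcast on one specification, whereas pooling spreads the risk of mis-specification across all of them, which is the mechanism the summary credits for the stability of the combined forecasts.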

Similar resources

Methods for Pastcasting, Nowcasting and Forecasting Using Factor-MIDAS with an Application to Real-Time Korean GDP *

We discuss a variety of recent methodological advances that can be used to estimate mixed frequency factor-MIDAS models for the purpose of pastcasting, nowcasting, and forecasting. In order to illustrate the uses of this methodology, we introduce a new real-time Korean GDP dataset, and carry out a series of prediction experiments, using a two step approach. In a first step, we estimate common l...

Bayesian Variable Selection for Nowcasting Economic Time Series

We consider the problem of short-term time series forecasting (nowcasting) when there are more possible predictors than observations. Our approach combines three Bayesian techniques: Kalman filtering, spike-and-slab regression, and model averaging. We illustrate this approach using search engine query data as predictors for consumer sentiment and gun sales.

Nowcasting with Numerous Candidate Predictors

The goal of nowcasting, or “predicting the present,” is to estimate up-to-date values for a time series whose actual observations are available only with a delay. Methods for this task leverage observations of correlated time series to estimate values of the target series. This paper introduces a nowcasting technique called FDR (false discovery reduction) that combines tractable variable select...

Application of Multi-channel 3D-cube Successive Convolution Network for Convective Storm Nowcasting

Convective storm nowcasting has attracted substantial attention in various fields. Existing methods under a deep learning framework rely primarily on radar data. Although they nowcast storm advection well, it is still challenging to nowcast storm initiation and growth, due to the limitations of the radar observations. This paper describes the first attempt to nowcast storm initiation, g...

An Application of Fuzzy TOPSIS Method for Plant Selection in Rangeland Improvement (Case Study: Boroujerd Rangeland, Lorestan Province, Iran)

Species selection based on a new method such as a fuzzy method is one of the most important stages in successful plantation management planning, as choosing a suitable species for the site can be the key to success. This paper is based on a fuzzy extension of the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method. The purpose of this paper is to develop fuzzy TO...

Publication date: 2009
